Abstract

We prove a central limit theorem for random sums of the form $\sum_{i=1}^{N_n} X_i$, where $\{X_i\}_{i \geq 1}$ is a stationary
$m$-dependent process and $N_n$ is a random index independent of $\{X_i\}_{i \geq 1}$. Our proof generalizes
Chen and Shao's result for the i.i.d. case, and consequently we recover their result. A variation of
a recent result of Shang on $m$-dependent sequences is also obtained as a corollary. Examples on moving
averages and descent processes are provided, and possible applications in non-parametric statistics are
discussed.

1 Introduction 

In the following, we analyze the asymptotic behavior of random sums of the form $\sum_{i=1}^{N_n} X_i$ as $n \to \infty$, where
the $X_i$'s are non-negative random variables that are stationary and $m$-dependent, and $N_n$ is a non-negative
integer valued random variable independent of the $X_i$'s. Limiting distributions of random sums of independent
and identically distributed (i.i.d.) random variables are well studied. See [3], [10], [12] and the references therein.
Asymptotic normality of deterministic sums of $m$-dependent random variables is also well known. See,
for example, [2], [9] and [11]. To the best of the author's knowledge, previous work on random sums
of the form $\sum_{i=1}^{N_n} X_i$ where the $X_i$'s are dependent is limited to [13], which treats $m$-dependent random
variables, and [1], which investigates random variables that arise from integrating a random
field with respect to point processes. Our results here are along the lines of [4], generalizing the result there to
the stationary $m$-dependent case. Along the way, we also improve the results given in [13].

We now recall stationary and $m$-dependent processes. Let $\{X_i\}_{i \geq 1}$ be a stochastic process and let
$F_X(X_{i_1+m}, \ldots, X_{i_k+m})$ be the cumulative distribution function of the joint distribution of $\{X_i\}_{i \geq 1}$ at times
$i_1 + m, \ldots, i_k + m$. Then $\{X_i\}_{i \geq 1}$ is said to be stationary if, for all $k$, for all $m$ and for all $i_1, \ldots, i_k$,

$$F_X(X_{i_1+m}, \ldots, X_{i_k+m}) = F_X(X_{i_1}, \ldots, X_{i_k})$$

holds. For more on stationary processes, see [14]. If we define the distance between two subsets $A$ and $B$
of $\mathbb{N}$ by

$$\rho(A, B) := \inf\{|i - j| : i \in A, j \in B\},$$

then the sequence $\{X_i\}_{i \geq 1}$ is said to be $m$-dependent if $\{X_i, i \in A\}$ and $\{X_j, j \in B\}$ are independent
whenever $\rho(A, B) > m$ for $A, B \subset \mathbb{N}$.

An example of a stationary $m$-dependent process is given by the moving averages process. Assume
that $\{T_i\}_{i \geq 1}$ is a sequence of i.i.d. random variables with finite mean $\mu$ and finite variance $\sigma^2$. Letting
$X_i = (T_i + T_{i+1})/2$, $\{X_i\}_{i \geq 1}$ is a stationary 1-dependent process with $E[X_i] = \mu$, $\mathrm{Var}(X_i) = \sigma^2/2$ and
$\mathrm{Cov}(X_1, X_2) = \sigma^2/4$.

This paper is organized as follows. In the next section, we state our main results and compare them
with previous approaches. In the third section, we give examples on moving averages and descent processes,
relating them to possible nonparametric tests where the number of observations is itself random. Proofs of the
main results are given in Section 4, and we conclude the paper with a discussion of future directions.

2 Main Results 

We start with two propositions. Their proofs are standard and are given at the end of Section 4.

Proposition 2.1. Let $\{X_i\}_{i \geq 1}$ be a stationary $m$-dependent process with $\mu := E[X_i]$, $\sigma^2 := \mathrm{Var}(X_i) < \infty$ and
$a_j := \mathrm{Cov}(X_1, X_{1+j})$. Then for any $N \geq 1$, we have

$$\mathrm{Var}\left(\sum_{i=1}^{N} X_i\right) = N\left(\sigma^2 + 2\sum_{j=1}^{m} a_j \Gamma_{N,j}\right) - 2\sum_{j=1}^{m} j a_j \Gamma_{N,j},$$

where $\Gamma_{N,j} = 1(N \geq j + 1)$.
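The closed form in Proposition 2.1 can be checked against a direct sum over the covariance matrix. The following Python sketch is only illustrative: the function names `var_sum_formula` and `var_sum_direct` are hypothetical, and the list `a` is assumed to hold the covariances with `a[j-1]` standing for $a_j$.

```python
from fractions import Fraction


def var_sum_formula(N, sigma2, a):
    """Variance of X_1 + ... + X_N from the closed form in Proposition 2.1.

    a[j-1] plays the role of a_j = Cov(X_1, X_{1+j}) for j = 1, ..., m.
    """
    m = len(a)
    # Gamma_{N,j} = 1(N >= j + 1)
    gamma = [1 if N >= j + 1 else 0 for j in range(1, m + 1)]
    first = N * (sigma2 + 2 * sum(a[j - 1] * gamma[j - 1] for j in range(1, m + 1)))
    second = 2 * sum(j * a[j - 1] * gamma[j - 1] for j in range(1, m + 1))
    return first - second


def var_sum_direct(N, sigma2, a):
    """Same variance, obtained by summing every entry of the covariance matrix."""
    m = len(a)
    total = 0
    for i in range(N):
        for k in range(N):
            lag = abs(i - k)
            if lag == 0:
                total += sigma2
            elif lag <= m:
                total += a[lag - 1]
    return total


# Moving average X_i = (T_i + T_{i+1}) / 2 with Var(T_i) = 1:
# Var(X_i) = 1/2 and a_1 = 1/4.
sigma2, a = Fraction(1, 2), [Fraction(1, 4)]
for N in range(1, 8):
    assert var_sum_formula(N, sigma2, a) == var_sum_direct(N, sigma2, a)
```

Exact rational arithmetic is used so that the two computations can be compared with `==` rather than with a floating point tolerance.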

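Proposition 2.2. Let $\{X_i\}_{i \geq 1}$ be as in Proposition 2.1. Let the $Y_i$'s be i.i.d. non-negative integer valued
random variables with $\nu := E[Y_1]$, $\tau^2 := \mathrm{Var}(Y_1) < \infty$ and assume that the $X_i$'s and $Y_i$'s are independent.
Define $N_n = \sum_{i=1}^{n} Y_i$. Then we have

$$\mathrm{Var}\left(\sum_{i=1}^{N_n} X_i\right) = n\left(\nu\sigma^2 + 2\nu\sum_{j=1}^{m} a_j + \mu^2\tau^2\right) + \alpha(m)$$

where

$$\alpha(m) = \sum_{k=0}^{m}\left(2k\sum_{j=1}^{m} a_j(\Gamma_{k,j} - 1) - 2\sum_{j=1}^{m} j a_j(\Gamma_{k,j} - 1)\right)P(N_n = k) - 2\sum_{j=1}^{m} j a_j$$

and $\Gamma_{k,j} = 1(k \geq j + 1)$. In particular, $\alpha(m)/n \to 0$ as $n \to \infty$. When the $X_i$'s are also independent (i.e., $m = 0$),
this reduces to

$$\mathrm{Var}\left(\sum_{i=1}^{N_n} X_i\right) = n(\nu\sigma^2 + \mu^2\tau^2).$$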
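For Bernoulli indices the variance in Proposition 2.2 can be computed exactly by conditioning on $N_n$, which gives a way to check the closed form. The sketch below assumes $Y_i \sim$ Bernoulli($p$), so that $N_n$ is Binomial($n$, $p$), and it assumes that the correction term $\alpha(m)$ carries the weights $P(N_n = k)$ produced by the conditioning argument of Section 4. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def var_fixed(k, sigma2, a):
    """Var(X_1 + ... + X_k) for a deterministic k (Proposition 2.1)."""
    return k * sigma2 + 2 * sum(
        (k - j) * aj for j, aj in enumerate(a, start=1) if k >= j + 1
    )


def binom_pmf(n, p, k):
    """P(N_n = k) when N_n is Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def random_sum_var_by_conditioning(n, p, mu, sigma2, a):
    """Var of X_1 + ... + X_{N_n}, obtained by conditioning on N_n."""
    mean = n * p * mu  # E[sum] = n * nu * mu by Wald's identity
    return sum(
        (var_fixed(k, sigma2, a) + (k * mu - mean) ** 2) * binom_pmf(n, p, k)
        for k in range(n + 1)
    )


def random_sum_var_closed_form(n, p, mu, sigma2, a):
    """n(nu sigma^2 + 2 nu sum_j a_j + mu^2 tau^2) + alpha(m) for Bernoulli(p) Y_i."""
    nu, tau2, m = p, p * (1 - p), len(a)
    main = n * (nu * sigma2 + 2 * nu * sum(a) + mu**2 * tau2)
    alpha = -2 * sum(j * aj for j, aj in enumerate(a, start=1))
    for k in range(min(m, n) + 1):
        # Gamma_{k,j} - 1 equals -1 when k <= j and 0 otherwise
        corr = [(1 if k >= j + 1 else 0) - 1 for j in range(1, m + 1)]
        term = 2 * k * sum(aj * c for aj, c in zip(a, corr))
        term -= 2 * sum(j * aj * c for j, (aj, c) in enumerate(zip(a, corr), start=1))
        alpha += term * binom_pmf(n, p, k)
    return main + alpha


# 1-dependent moving average with Var(T_i) = 1, mean 1, and p = 1/3.
args = (5, Fraction(1, 3), 1, Fraction(1, 2), [Fraction(1, 4)])
assert random_sum_var_by_conditioning(*args) == random_sum_var_closed_form(*args)
```

Since $\alpha(m)$ stays bounded in $n$, the difference between the exact variance and the leading term $n(\nu\sigma^2 + 2\nu\sum_j a_j + \mu^2\tau^2)$ is negligible after dividing by $n$.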

In the following, we use $\to_d$ for convergence in distribution and $=_d$ for equality in distribution.
Also, $N(0, 1)$ and $\Phi$ denote a standard normal random variable and its cumulative distribution function,
respectively. Now we are ready to present our main result.

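Theorem 2.3. Let $\{X_i\}_{i \geq 1}$ be a non-negative stationary $m$-dependent process with $\mu := E[X_i] > 0$,
$\sigma^2 := \mathrm{Var}(X_i) > 0$, $a_j := \mathrm{Cov}(X_1, X_{1+j})$, $\sigma^2 + 2\sum_{j=1}^{m} a_j > 0$ and $E|X_1|^3 < \infty$. Let the $Y_i$'s be i.i.d.
non-negative integer valued random variables with $\nu := E[Y_1] > 0$, $\tau^2 := \mathrm{Var}(Y_1) > 0$, $E|Y_1|^3 < \infty$ and suppose
that the $X_i$'s and $Y_i$'s are independent. Define $N_n = \sum_{i=1}^{n} Y_i$. Then

$$\frac{\sum_{i=1}^{N_n} X_i - n\mu\nu}{\sqrt{n\left(\nu\sigma^2 + 2\nu\sum_{j=1}^{m} a_j + \mu^2\tau^2\right)}} \to_d N(0, 1) \qquad (2.1)$$

as $n \to \infty$.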



Note that the assumptions on the $Y_i$'s hold, for example, when the $Y_i$'s are non-degenerate i.i.d. Bernoulli random
variables. This is one of the most natural cases, since then we may regard $\sum_{i=1}^{N_n} X_i$ as the sum of
outcomes of a series of experiments in which each observation is blocked with a fixed probability, independently
of the others. The main assumption on the $X_i$'s (the others are non-degeneracy conditions) is a third moment condition.
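A small Monte Carlo experiment illustrates the normalization in the Bernoulli case. The sketch below is not part of the argument: it assumes $T_i \sim$ Exponential(1), $X_i = (T_i + T_{i+1})/2$ and $Y_i \sim$ Bernoulli($p$), and it normalizes the random sum by the leading variance term $n(\nu\sigma^2 + 2\nu a_1 + \mu^2\tau^2)$ from Proposition 2.2. The function name and all parameter values are chosen only for illustration.

```python
import math
import random


def simulate_normalized_sums(n, p, reps, seed=0):
    """Draw `reps` copies of the centered and scaled random sum.

    Assumptions made for illustration: T_i ~ Exponential(1) (non-negative,
    mean 1, variance 1), X_i = (T_i + T_{i+1}) / 2, and Y_i ~ Bernoulli(p).
    """
    rng = random.Random(seed)
    mu, sigma2, a1 = 1.0, 0.5, 0.25  # mean, variance, lag-1 covariance of X_i
    nu, tau2 = p, p * (1 - p)        # mean and variance of Y_i
    scale = math.sqrt(n * (nu * sigma2 + 2 * nu * a1 + mu**2 * tau2))
    out = []
    for _ in range(reps):
        n_obs = sum(rng.random() < p for _ in range(n))       # N_n
        t = [rng.expovariate(1.0) for _ in range(n_obs + 1)]  # T_1, ..., T_{N_n + 1}
        s = sum((t[i] + t[i + 1]) / 2 for i in range(n_obs))  # X_1 + ... + X_{N_n}
        out.append((s - n * mu * nu) / scale)
    return out


samples = simulate_normalized_sums(n=400, p=0.5, reps=2000)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / (len(samples) - 1)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

With these choices the sample mean should be close to 0 and the sample variance close to 1; a histogram or a normal probability plot of `samples` would give a more detailed picture.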

Since our proof is a direct generalization of Chen and Shao's result on the i.i.d. case (which is the case
$m = 0$), we recover their result from [3].

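Theorem 2.4. Let $\{X_i\}_{i \geq 1}$ be i.i.d. random variables with $\mu := E[X_i] > 0$, $\sigma^2 := \mathrm{Var}(X_i) > 0$, and
assume that $E|X_1|^3 < \infty$. Let the $Y_i$'s be i.i.d. non-negative integer valued random variables with $\nu := E[Y_1] > 0$,
$\tau^2 := \mathrm{Var}(Y_1) > 0$, $E|Y_1|^3 < \infty$ and assume that the $X_i$'s and $Y_i$'s are independent. Define $N_n = \sum_{i=1}^{n} Y_i$.
Then for any $n \geq 1$, we have

$$\sup_{z \in \mathbb{R}} \left| P\left(\frac{\sum_{i=1}^{N_n} X_i - n\mu\nu}{\sqrt{n(\nu\sigma^2 + \mu^2\tau^2)}} \leq z\right) - \Phi(z) \right| \leq \frac{C}{\sqrt{n}}\left(\frac{\tau^2}{\nu^2} + \frac{E|Y_1|^3}{\tau^3} + \frac{E|X_1|^3}{\nu^{1/2}\sigma^3} + \frac{\sigma}{\mu\sqrt{\nu}}\right) \qquad (2.2)$$

where $C$ is a constant independent of $n$.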

We will explain how the proof of Theorem 2.3 also yields Theorem 2.4 in Section 4. We note that in the
original statement of Chen and Shao's result, $\mu$ is allowed to be 0. We excluded this case in our statement since the
upper bound in (2.2) is $\infty$ when $\mu = 0$.

Our final result is a variation of the main theorem of [13] on the asymptotics of random
sums of $m$-dependent random variables. Namely, we have

Theorem 2.5. Under the assumptions of Theorem 2.3,

$$\frac{\sum_{i=1}^{N_n} X_i - N_n\mu}{\sqrt{N_n\left(\sigma^2 + 2\sum_{j=1}^{m} a_j\right)}} \to_d N(0, 1) \qquad (2.3)$$

as $n \to \infty$.



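Remark 2.6. Indeed, as can be seen from the proof of Theorem 2.3, one can obtain convergence rates when
the scaling is perturbed a little bit. More precisely, we have

$$\sup_{z \in \mathbb{R}} \left| P\left(\frac{\sum_{i=1}^{N_n} X_i - N_n\mu}{\sqrt{N_n}\,\sigma'} \leq z\right) - \Phi(z) \right| \leq \frac{C}{\sqrt{n}}$$

for a universal constant $C$ and for every $n \geq 1$, where

$$(\sigma')^2 = \sigma^2 + 2\sum_{j=1}^{m} a_j \Gamma_{N_n,j} - \frac{2}{N_n}\sum_{j=1}^{m} j a_j \Gamma_{N_n,j}$$

and $\Gamma_{N_n,j} = 1(N_n \geq j + 1)$.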



3 Examples 

Example 3.1. (Moving averages) Assume that $\{T_i\}_{i \geq 1}$ is a sequence of i.i.d. random variables with finite
mean $\mu$ and finite variance $\sigma^2$. Letting

$$X_i = \frac{T_i + T_{i+1}}{2}, \quad i \geq 1,$$

$\{X_i\}_{i \geq 1}$ is a stationary 1-dependent process with $E[X_i] = \mu$, $\mathrm{Var}(X_i) = \sigma^2/2$ and $\mathrm{Cov}(X_1, X_2) = \sigma^2/4$.

When $\mu > 0$ and $\sigma^2 > 0$, we can apply Theorem 2.3 as long as the assumptions on $N_n$ are satisfied (as
noted above, they are satisfied, for example, when the $Y_i$'s are independent Bernoulli random variables with
success probability $p \in (0, 1)$). This discussion can be generalized to $m$-moving averages defined as

$$X_i = \frac{T_i + T_{i+1} + \cdots + T_{i+m-1}}{m}, \quad i \geq 1, \; m \in \mathbb{N},$$

in a straightforward way.
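As a worked instance, consider the 1-dependent moving average together with Bernoulli($p$) indices. The computation below is a sketch that only substitutes the moments stated in this example into the leading variance term of Proposition 2.2. To avoid a clash of notation, $\sigma_T^2$ (a symbol introduced here) denotes $\mathrm{Var}(T_i)$, so that $\mathrm{Var}(X_i) = \sigma_T^2/2$ and $a_1 = \sigma_T^2/4$, while $\nu = p$ and $\tau^2 = p(1-p)$.

```latex
\nu\,\mathrm{Var}(X_1) + 2\nu a_1 + \mu^2\tau^2
  = p\,\frac{\sigma_T^2}{2} + 2p\,\frac{\sigma_T^2}{4} + \mu^2 p(1-p)
  = p\,\sigma_T^2 + \mu^2 p(1-p).
```

Under these assumptions the leading term coincides with the one obtained for i.i.d. summands of variance $\sigma_T^2$: the positive covariance $a_1$ compensates exactly for the smaller variance of each average.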

Example 3.2. (Descent processes) A sequence of real numbers $(t_i)_{i=1}^{n}$ is said to have a descent at position
$1 \leq k \leq n - 1$ if $t_k > t_{k+1}$. Here we are interested in the descent process of a sequence of random variables.
Statistics related to descents are often used in nonparametric statistics to test independence or correlation
(for example, one uses the number of inversions in Kendall's tau statistic). See [7] for a brief introduction
to this connection. Also see [6] to learn more about why these processes are important.

Now let the $T_i$'s be i.i.d. random variables with distribution $F$, and $X_i := 1(T_i > T_{i+1})$. Also let the $Y_i$'s be i.i.d.
Bernoulli random variables with parameter $p \in (0, 1)$ and set $N_n = \sum_{i=1}^{n} Y_i$. Defining

$$W_n = \sum_{i=1}^{N_n - 1} X_i,$$

$W_n$ is the number of descents in the random length sequence $(T_1, T_2, \ldots, T_{N_n})$.

Here $\{X_i\}_{i \geq 1}$ is a stationary 1-dependent process and it is easy to check that $\mu = 1/2$, $\sigma^2 = 1/4$ and
$\sigma^2 + 2a_1 = 1/12$. So the assumptions of Theorem 2.3 are satisfied and we obtain the asymptotic normality of $W_n$.
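The moments quoted in this example can be verified by enumeration. The sketch below assumes that $F$ is continuous, so that ties have probability zero and the relative order of $(T_1, T_2, T_3)$ is uniform over the six permutations; the function name `descent_moments` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def descent_moments():
    """Return (mu, sigma^2, a_1) for the descent indicators X_i = 1(T_i > T_{i+1}).

    Assumes a continuous F, so only the relative order of T_1, T_2, T_3 matters.
    """
    perms = list(permutations(range(3)))
    total = len(perms)
    # mu = P(T_1 > T_2)
    mu = Fraction(sum(1 for t in perms if t[0] > t[1]), total)
    # E[X_1 X_2] = P(T_1 > T_2 > T_3)
    joint = Fraction(sum(1 for t in perms if t[0] > t[1] > t[2]), total)
    sigma2 = mu * (1 - mu)   # variance of a Bernoulli(mu) indicator
    a1 = joint - mu * mu     # Cov(X_1, X_2)
    return mu, sigma2, a1


mu, sigma2, a1 = descent_moments()
assert (mu, sigma2, a1) == (Fraction(1, 2), Fraction(1, 4), Fraction(-1, 12))
assert sigma2 + 2 * a1 == Fraction(1, 12)
```

The negative covariance $a_1 = -1/12$ reflects the fact that two consecutive descents are less likely than two independent ones.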

Example 3.3. (Non-parametric statistics) In this example, we discuss a possible application of Theorem
2.3 in non-parametric statistics. Let $T_1, \ldots, T_n$ be the random outcomes of an experiment and assume that
each of these is observed with probability $p \in (0, 1)$, independently of the others. Let $N_n$ be the number of
actually observed outcomes and $O_1, \ldots, O_{N_n}$ be the corresponding sequence of observations.
Suppose we want to test

$$H_0 : T_1, \ldots, T_n \text{ are uncorrelated and } p = p_0.$$

Then one can use the test statistic

$$W_n = \sum_{i=1}^{N_n - 1} 1(O_i > O_{i+1})$$

and Theorem 2.3 to understand the asymptotic distribution of $W_n$ under the null hypothesis. A very large
or a very small value of this statistic will provide information about the dependence structure of the $T_i$'s.
Extensions of this observation to more general tests will be pursued in a subsequent work.



4 Proofs 

We start by recalling two results that will be useful in the proof of the main theorem. The first of these is a
central limit theorem for $m$-dependent random variables established in [5].

Theorem 4.1. [5] If $\{X_i\}_{i \geq 1}$ is a sequence of zero mean $m$-dependent random variables and $W = \sum_{i=1}^{n} X_i$
with $\mathrm{Var}(W) = 1$, then for all $p \in (2, 3]$,

$$\sup_{z \in \mathbb{R}} |P(W \leq z) - \Phi(z)| \leq 75(10m + 1)^{p-1} \sum_{i=1}^{n} E|X_i|^p.$$

The second result we will need is the following theorem of Chen and Shao ([4]) on the normal approximation
of random variables. We note that this theorem is part of what is known as the concentration inequality
approach in the Stein's method literature. See the cited paper or [3] for more on this.

Theorem 4.2. [4] Let $\xi_1, \ldots, \xi_n$ be independent mean zero random variables with $\sum_{i=1}^{n} \mathrm{Var}(\xi_i) = 1$.
Let $W = \sum_{i=1}^{n} \xi_i$, $T = W + \Delta$, and also for each $i = 1, \ldots, n$, let $\Delta_i$ be a random variable such that
$\xi_i$ and $(W - \xi_i, \Delta_i)$ are independent. Then we have

$$\sup_{z \in \mathbb{R}} |P(T \leq z) - \Phi(z)| \leq 6.1(\beta_2 + \beta_3) + E|W\Delta| + \sum_{i=1}^{n} E|\xi_i(\Delta - \Delta_i)| \qquad (4.1)$$

where

$$\beta_2 = \sum_{i=1}^{n} E[\xi_i^2 1(|\xi_i| > 1)] \quad \text{and} \quad \beta_3 = \sum_{i=1}^{n} E[|\xi_i|^3 1(|\xi_i| \leq 1)].$$

Before moving on to the proof of Theorem 2.3, we finally recall the Prokhorov and Kolmogorov distances
between probability measures. Let $\mathcal{P}(\mathbb{R})$ be the collection of all probability measures on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ where
$\mathcal{B}(\mathbb{R})$ is the Borel sigma algebra on $\mathbb{R}$. For a subset $A \subset \mathbb{R}$, define the $\varepsilon$-neighborhood of $A$ by

$$A^{\varepsilon} := \{p \in \mathbb{R} : \exists q \in A, \; d(p, q) < \varepsilon\} = \bigcup_{p \in A} B_{\varepsilon}(p)$$

where $B_{\varepsilon}(p)$ is the open ball of radius $\varepsilon$ centered at $p$. Then the Prokhorov metric $d_P : \mathcal{P}(\mathbb{R})^2 \to [0, \infty)$ is
defined by setting the distance between two probability measures $\mu$ and $\nu$ to be

$$d_P(\mu, \nu) := \inf\{\varepsilon > 0 : \mu(A) \leq \nu(A^{\varepsilon}) + \varepsilon \text{ and } \nu(A) \leq \mu(A^{\varepsilon}) + \varepsilon, \; \forall A \in \mathcal{B}(\mathbb{R})\}. \qquad (4.2)$$

The Kolmogorov distance $d_K$ between two probability measures $\mu$ and $\nu$ is defined to be

$$d_K(\mu, \nu) = \sup_{z \in \mathbb{R}} |\mu((-\infty, z]) - \nu((-\infty, z])|.$$

The following two facts will be useful: (1) Convergence of measures in the Prokhorov metric is equivalent
to the weak convergence of measures. (2) Convergence in Kolmogorov distance implies convergence in
distribution, but the converse is not true. See, for example, [14] for these standard results.
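To make the Kolmogorov distance concrete, the following Python sketch computes it between the empirical distribution of a finite sample and the standard normal law. It is only an illustration: the function names are hypothetical, and the supremum is evaluated at the sample points, where the empirical distribution function jumps.

```python
import math


def std_normal_cdf(z):
    """Phi(z), the standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kolmogorov_distance_to_normal(sample):
    """sup_z |F_n(z) - Phi(z)|, where F_n is the empirical CDF of `sample`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        phi = std_normal_cdf(x)
        # F_n jumps from i/n to (i+1)/n at x, so both sides of the jump are checked
        d = max(d, abs((i + 1) / n - phi), abs(i / n - phi))
    return d


# A point mass at 0 is at distance 1/2 from the standard normal law.
assert kolmogorov_distance_to_normal([0.0]) == 0.5
```

Applied to simulated values of a normalized random sum, this quantity gives a numerical proxy for the distances that are bounded in the proof below.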

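Now we are ready to prove Theorem 2.3. We will follow the notation of [4] as much as possible.

Proof of Theorem 2.3: Let $Z_1$, $Z_2$ and $Z_3$ be independent standard normal random variables which
are also independent of the $X_i$'s and $Y_i$'s. Put

$$b = \sqrt{\nu\sigma^2 + 2\nu\sum_{j=1}^{m} a_j + \mu^2\tau^2}.$$

Define

$$T_n = \frac{\sum_{i=1}^{N_n} X_i - n\mu\nu}{\sqrt{n}\,b} = \frac{\sqrt{N_n}\,\sigma'}{\sqrt{n}\,b} H_n + \frac{(N_n - n\nu)\mu}{\sqrt{n}\,b}$$

where

$$H_n = \frac{\sum_{i=1}^{N_n} X_i - N_n\mu}{\sqrt{N_n}\,\sigma'} \quad \text{and} \quad (\sigma')^2 = \sigma^2 + 2\sum_{j=1}^{m} a_j \Gamma_{N_n,j} - \frac{2}{N_n}\sum_{j=1}^{m} j a_j \Gamma_{N_n,j} \qquad (4.3)$$

with $\Gamma_{N_n,j} = 1(N_n \geq j + 1)$. Also write

$$T_n(Z_1) = \frac{\sqrt{N_n}\,\sigma'}{\sqrt{n}\,b} Z_1 + \frac{(N_n - n\nu)\mu}{\sqrt{n}\,b}.$$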
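For $n$ large enough, we have $m < n\nu/2$. For such $n$, we have

$$\begin{aligned}
d_K(T_n, T_n(Z_1)) &= d_K(H_n, Z_1) \\
&\leq P(|N_n - n\nu| > n\nu/2) + \sup_{z \in \mathbb{R}} E\left[\left|E\left[1(H_n \leq z) - 1(Z_1 \leq z) \mid N_n\right]\right| 1(|N_n - n\nu| \leq n\nu/2)\right] \\
&\leq \frac{4\tau^2}{n\nu^2} + E\left[\frac{C N_n E|X_1|^3 \, 1(|N_n - n\nu| \leq n\nu/2)}{\left(N_n\left(\sigma^2 + 2\sum_{j=1}^{m} a_j - \frac{2}{N_n}\sum_{j=1}^{m} j a_j\right)\right)^{3/2}}\right] \qquad (4.4)
\end{aligned}$$

where for (4.4) we used Chebyshev's inequality for the first estimate and Theorem 4.1 with $p = 3$ for
the second estimate. Here the condition that $m < n\nu/2$ simplifies $(\sigma')^2$ as defined in (4.3) to
$(\sigma')^2 = \sigma^2 + 2\sum_{j=1}^{m} a_j - \frac{2}{N_n}\sum_{j=1}^{m} j a_j$ when $|N_n - n\nu| \leq n\nu/2$. Also note that throughout this proof, $C$ will be
a positive constant with not necessarily the same value in different lines. Now if $\sum_{j=1}^{m} j a_j \leq 0$, then the
bound in (4.4) yields

$$d_K(T_n, T_n(Z_1)) \leq \frac{4\tau^2}{n\nu^2} + \frac{C E|X_1|^3}{\sqrt{n\nu/2}\left(\sigma^2 + 2\sum_{j=1}^{m} a_j\right)^{3/2}} \to 0$$

as $n \to \infty$. Else if $\sum_{j=1}^{m} j a_j > 0$, we observe that for large enough $n$, we have
$\sigma^2 + 2\sum_{j=1}^{m} a_j - \frac{4}{n\nu}\sum_{j=1}^{m} j a_j > 0$ by our assumption that $\sigma^2 + 2\sum_{j=1}^{m} a_j > 0$. For such $n$, using the bound in (4.4) and
the fact that $N_n \geq n\nu/2$ on the event $\{|N_n - n\nu| \leq n\nu/2\}$, we obtain

$$d_K(T_n, T_n(Z_1)) \leq \frac{4\tau^2}{n\nu^2} + \frac{C E|X_1|^3}{\sqrt{n\nu/2}\left(\sigma^2 + 2\sum_{j=1}^{m} a_j - \frac{4}{n\nu}\sum_{j=1}^{m} j a_j\right)^{3/2}} \qquad (4.5)$$

and this yields $d_K(T_n, T_n(Z_1)) \to 0$ as $n \to \infty$ when $\sum_{j=1}^{m} j a_j > 0$.

Hence we conclude that $d_K(T_n, T_n(Z_1)) \to 0$ as $n \to \infty$ as long as $\nu > 0$ and $\sigma^2 + 2\sum_{j=1}^{m} a_j > 0$. This
in particular implies

$$d_P(T_n, T_n(Z_1)) \to 0 \qquad (4.6)$$

as $n \to \infty$ where $d_P$ is the Prokhorov distance as defined in (4.2).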



Next let $(\sigma'')^2 = (\sigma')^2 + 2\sum_{j=1}^{m} a_j(1 - \Gamma_{N_n,j}) + \frac{2}{N_n}\sum_{j=1}^{m} j a_j \Gamma_{N_n,j}$ so that

$$(\sigma'')^2 = \sigma^2 + 2\sum_{j=1}^{m} a_j.$$

Note that $\sigma''$ is not random and introduce

$$T_n'(Z_1) = \frac{\sqrt{N_n}\,\sigma''}{\sqrt{n}\,b} Z_1 + \frac{(N_n - n\nu)\mu}{\sqrt{n}\,b}$$

and

$$T_n(Z_1, Z_2) = \frac{\sqrt{\nu}\,\sigma''}{b} Z_1 + \frac{\tau\mu}{b} Z_2.$$

One can easily check that $T_n(Z_1, Z_2)$ is a standard normal random variable since $Z_1$ and $Z_2$ are assumed
to be independent. So if we can show that $d_P(T_n(Z_1), T_n'(Z_1)) \to 0$ and $d_P(T_n'(Z_1), T_n(Z_1, Z_2)) \to 0$ as
$n \to \infty$, then the result will follow from an application of the triangle inequality. We start by showing that
$d_P(T_n'(Z_1), T_n(Z_1, Z_2)) \to 0$. For this purpose, we will use Chen and Shao's concentration inequality approach
to get bounds in the Kolmogorov distance and to recover Chen and Shao's result on the i.i.d. case (if we just
wanted to show $d_P(T_n'(Z_1), T_n(Z_1, Z_2)) \to 0$, then this could be done in a much easier way; see Remark 4.3).
The following argument is in a sense a rewriting of the corresponding proof in [4] with slight changes, since the
concentration approach is used on $N_n$, which is in both problems a sum of independent random variables.
For the sake of completeness, we include all details.

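Define the truncation $\bar{x}$ of $x \in \mathbb{R}$ by

$$\bar{x} = \begin{cases} n\nu/2 & \text{if } x < n\nu/2 \\ x & \text{if } n\nu/2 \leq x \leq 3n\nu/2 \\ 3n\nu/2 & \text{if } x > 3n\nu/2 \end{cases}$$

and let

$$T = W + \Delta$$

where

$$W = \frac{N_n - n\nu}{\sqrt{n}\,\tau} \quad \text{and} \quad \Delta = \frac{\left(\sqrt{\overline{N_n}} - \sqrt{n\nu}\right)\sigma'' Z_1}{\sqrt{n}\,\tau\mu}.$$

Since $Y_i$ is independent of $N_n - Y_i$ for all $i = 1, \ldots, n$, we can apply Theorem 4.2 to $W + \Delta$ setting

$$\Delta_i = \frac{\left(\sqrt{\overline{N_n - Y_i + \nu}} - \sqrt{n\nu}\right)\sigma'' Z_1}{\sqrt{n}\,\tau\mu}, \quad i = 1, \ldots, n.$$

(So $\xi_i = (Y_i - \nu)/(\sqrt{n}\,\tau)$ in Theorem 4.2.) For the first term of the upper bound given in (4.1), we have

$$6.1(\beta_2 + \beta_3) \leq 6.1(2n)E|\xi_1|^3 \leq \frac{C n E|Y_1|^3}{(n\tau^2)^{3/2}} = \frac{C E|Y_1|^3}{\tau^3\sqrt{n}}. \qquad (4.7)$$

For the second term in (4.1), we have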
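
$$E|W\Delta| = E|Z_1| \, E\left|W\left(\sqrt{\overline{N_n}} - \sqrt{n\nu}\right)\right| \frac{\sigma''}{\sqrt{n}\,\tau\mu} = \frac{E|Z_1|\,\sigma''}{\sqrt{n}\,\tau\mu} \, E\left|W \frac{\overline{N_n} - n\nu}{\sqrt{\overline{N_n}} + \sqrt{n\nu}}\right|$$

where we used the identity $\sqrt{x} - \sqrt{y} = \frac{x - y}{\sqrt{x} + \sqrt{y}}$ in the second equality. So by an application of the Cauchy-Schwarz
inequality, we obtain

$$E|W\Delta| \leq \frac{C\sigma''}{\sqrt{n}\,\tau\mu}\left(E|W|^2\right)^{1/2}\left(E\left|\frac{\overline{N_n} - n\nu}{\sqrt{\overline{N_n}} + \sqrt{n\nu}}\right|^2\right)^{1/2} \leq \frac{C\sigma''}{\sqrt{n}\,\tau\mu}\left(\frac{E|N_n - n\nu|^2}{n\nu}\right)^{1/2} = \frac{C\sigma''}{\mu\sqrt{n\nu}} \qquad (4.8)$$

since $E[W^2] = 1$ and $\mathrm{Var}(N_n) = n\tau^2$. Also note that for the second inequality we used $|\overline{N_n} - n\nu| \leq |N_n - n\nu|$,
which follows easily from the definition of the truncation.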
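For the third term of the bound in (4.1), we have

$$\begin{aligned}
\sum_{i=1}^{n} E|\xi_i(\Delta - \Delta_i)| &\leq \sum_{i=1}^{n} \left(E|\xi_i|^2\right)^{1/2}\left(E|\Delta - \Delta_i|^2\right)^{1/2} \\
&\leq \sum_{i=1}^{n} \frac{1}{\sqrt{n}} \, \frac{C\sigma''}{\sqrt{n}\,\tau\mu}\left(E\left|\frac{\overline{N_n} - \overline{N_n - Y_i + \nu}}{\sqrt{\overline{N_n}} + \sqrt{\overline{N_n - Y_i + \nu}}}\right|^2\right)^{1/2} \\
&\leq \frac{C\sigma''}{\tau\mu} \, \frac{\left(E|Y_1 - \nu|^2\right)^{1/2}}{\sqrt{n\nu}} = \frac{C\sigma''}{\mu\sqrt{n\nu}},
\end{aligned}$$

where we used $E|\xi_i|^2 = 1/n$, the identity $\sqrt{x} - \sqrt{y} = \frac{x - y}{\sqrt{x} + \sqrt{y}}$ and the inequality $|\bar{x} - \overline{x - y}| \leq |y|$.
We conclude

$$\sum_{i=1}^{n} E|\xi_i(\Delta - \Delta_i)| = \sum_{i=1}^{n} E\left|\frac{Y_i - \nu}{\sqrt{n}\,\tau}(\Delta - \Delta_i)\right| \leq \frac{C\sigma''}{\mu\sqrt{n\nu}}. \qquad (4.9)$$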
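The two elementary properties of the truncation used in (4.8) and (4.9) can be checked numerically. The sketch below is illustrative only; the function name `truncate` and the values of `n` and `nu` are hypothetical choices.

```python
def truncate(x, n, nu):
    """Clamp x to the interval [n*nu/2, 3*n*nu/2], as in the definition of the truncation."""
    lo, hi = n * nu / 2, 3 * n * nu / 2
    return min(max(x, lo), hi)


n, nu = 10, 0.5  # the truncation interval is [2.5, 7.5] and n*nu = 5
for x in range(0, 12):
    # Property used in (4.8): truncation does not move x away from n*nu
    assert abs(truncate(x, n, nu) - n * nu) <= abs(x - n * nu)
    for y in (-3, -1, 0, 1, 2, 5):
        # Property used in (4.9): truncation is 1-Lipschitz
        assert abs(truncate(x, n, nu) - truncate(x - y, n, nu)) <= abs(y)
```

Both properties hold because clamping to an interval is a contraction that fixes every point of the interval, and $n\nu$ lies inside $[n\nu/2, 3n\nu/2]$.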



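Using Theorem 4.2 we get

$$\begin{aligned}
\sup_{z \in \mathbb{R}} |P(T_n'(Z_1) \leq z) - P(T_n(Z_1, Z_2) \leq z)| &\leq P(|N_n - n\nu| > n\nu/2) \\
&\quad + \sup_{z \in \mathbb{R}} \left|E\left[\left(1(T_n'(Z_1) \leq z) - 1(T_n(Z_1, Z_2) \leq z)\right) 1(|N_n - n\nu| \leq n\nu/2)\right]\right| \\
&\leq \sup_{z \in \mathbb{R}} |P(W + \Delta \leq z) - P(Z_3 \leq z)| + \frac{4\tau^2}{n\nu^2} \\
&\leq \frac{C}{\sqrt{n}}\left(\frac{\tau^2}{\nu^2} + \frac{E|Y_1|^3}{\tau^3} + \frac{\sigma''}{\mu\sqrt{\nu}}\right) \qquad (4.10)
\end{aligned}$$

where for the last step we combined the three estimates given in (4.7), (4.8) and (4.9). Thus,

$$d_P(T_n'(Z_1), T_n(Z_1, Z_2)) \to 0 \qquad (4.11)$$

as $n \to \infty$ if $\nu, \tau, \mu > 0$.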

Finally we need to show that $d_P(T_n(Z_1), T_n'(Z_1)) \to 0$. First observe that

$$T_n(Z_1) - T_n'(Z_1) = \frac{\sqrt{N_n}\,(\sigma' - \sigma'')Z_1}{\sqrt{n}\,b} \to 0$$

almost surely as $n \to \infty$. Also we know that $T_n'(Z_1)$ converges in distribution to $T_n(Z_1, Z_2)$. Thus, using
Slutsky's theorem we conclude that $T_n(Z_1) = T_n'(Z_1) + T_n(Z_1) - T_n'(Z_1)$ also converges in distribution to
$T_n(Z_1, Z_2)$. Hence

$$d_P(T_n(Z_1), T_n'(Z_1)) \leq d_P(T_n(Z_1), T_n(Z_1, Z_2)) + d_P(T_n(Z_1, Z_2), T_n'(Z_1)) \to 0 \qquad (4.12)$$

as $n \to \infty$.

Hence combining (4.6), (4.11) and (4.12), we obtain

$$d_P(T_n, T_n(Z_1, Z_2)) \leq d_P(T_n, T_n(Z_1)) + d_P(T_n(Z_1), T_n'(Z_1)) + d_P(T_n'(Z_1), T_n(Z_1, Z_2)) \to 0$$

as $n \to \infty$ under the given assumptions and the result follows. □

Remark 4.3. We can show that $d_P(T_n'(Z_1), T_n(Z_1, Z_2)) \to 0$ easily if we are not interested in convergence
rates. To see this, note that we can write $T_n'(Z_1)$ as

$$T_n'(Z_1) = \sqrt{\frac{N_n}{n}} \, \frac{\sigma''}{b} Z_1 + \frac{N_n - n\nu}{\sqrt{n}\,\tau} \, \frac{\tau\mu}{b}.$$

Now by the strong law of large numbers $N_n/n \to \nu$ a.s. and by the standard central limit theorem for
independent random variables $\frac{N_n - n\nu}{\sqrt{n}\,\tau} \to_d Z$ where $Z$ is a standard normal random variable independent of
$Z_1$. Using Slutsky's theorem twice with these observations immediately reveals that $T_n'(Z_1)$ converges in
distribution to a standard normal random variable.

Proof of Theorem 2.4: First note that under independence, we have $a_j = 0$ for $j = 1, \ldots, m$ so that
$\sigma' = \sigma'' = \sigma$. Following the proof of Theorem 2.3, this implies that $d_K(T_n(Z_1), T_n'(Z_1)) = 0$ for every $n$. Now
the result follows from the estimates of $d_K(T_n, T_n(Z_1))$ and $d_K(T_n'(Z_1), T_n(Z_1, Z_2))$ by substituting $a_j = 0$
for $j = 1, \ldots, m$. □

Proof of Theorem 2.5: In the proof of Theorem 2.3 we showed that

$$d_K(H_n, Z_1) = d_K(T_n, T_n(Z_1)) \to 0$$

where $H_n = \frac{\sum_{i=1}^{N_n} X_i - N_n\mu}{\sqrt{N_n}\,\sigma'}$ and $(\sigma')^2 = \sigma^2 + 2\sum_{j=1}^{m} a_j \Gamma_{N_n,j} - \frac{2}{N_n}\sum_{j=1}^{m} j a_j \Gamma_{N_n,j}$. Since $\sigma'/\sigma'' \to 1$ almost surely
as $n \to \infty$, where $(\sigma'')^2 = \sigma^2 + 2\sum_{j=1}^{m} a_j$, the result follows from Slutsky's theorem. □

Finally we give the proofs of the variance formulas given in Propositions 2.1 and 2.2.

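Proof of Proposition 2.1: We have

$$\begin{aligned}
\mathrm{Var}\left(\sum_{i=1}^{N} X_i\right) &= \sum_{i=1}^{N} \mathrm{Var}(X_i) + 2\sum_{1 \leq i < j \leq N} \mathrm{Cov}(X_i, X_j) \\
&= N\sigma^2 + 2\sum_{j=1}^{m} (N - j) a_j 1(N \geq j + 1).
\end{aligned}$$

Rearranging terms, we obtain

$$\mathrm{Var}\left(\sum_{i=1}^{N} X_i\right) = N\left(\sigma^2 + 2\sum_{j=1}^{m} a_j 1(N \geq j + 1)\right) - 2\sum_{j=1}^{m} j a_j 1(N \geq j + 1),$$

by which the variance formula follows. □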



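Proof of Proposition 2.2: First note that the assumptions of Wald's identity are satisfied and so

$$E\left[\sum_{i=1}^{N_n} X_i\right] = n\nu\mu.$$

Using this, we get

$$\begin{aligned}
\mathrm{Var}\left(\sum_{i=1}^{N_n} X_i\right) &= E\left(\sum_{i=1}^{N_n} X_i - n\nu\mu\right)^2 \\
&= \sum_{k=m+1}^{\infty} E\left(\sum_{i=1}^{k} X_i - n\nu\mu\right)^2 P(N_n = k) + \sum_{k=0}^{m} E\left(\sum_{i=1}^{k} X_i - n\nu\mu\right)^2 P(N_n = k)
\end{aligned}$$

where for the second equality we conditioned on $N_n$, which is independent of the $X_i$'s. Next note that we have

$$E\left[\sum_{i=1}^{k} X_i\right] = k\mu \quad \text{and} \quad E\left(\sum_{i=1}^{k} X_i\right)^2 = k\sigma^2 + 2k\sum_{j=1}^{m} a_j \Gamma_{k,j} - 2\sum_{j=1}^{m} j a_j \Gamma_{k,j} + k^2\mu^2 \qquad (4.13)$$

with $\Gamma_{k,j} = 1(k \geq j + 1)$. Thus, using Proposition 2.1 and (4.13), and doing some elementary manipulations,
we obtain

$$\begin{aligned}
\mathrm{Var}\left(\sum_{i=1}^{N_n} X_i\right) &= \sum_{k=m+1}^{\infty} \left(E\left(\sum_{i=1}^{k} X_i\right)^2 - 2n\nu\mu E\left[\sum_{i=1}^{k} X_i\right] + n^2\nu^2\mu^2\right) P(N_n = k) \\
&\quad + \sum_{k=0}^{m} \left(E\left(\sum_{i=1}^{k} X_i\right)^2 - 2n\nu\mu E\left[\sum_{i=1}^{k} X_i\right] + n^2\nu^2\mu^2\right) P(N_n = k) \\
&= \sum_{k=m+1}^{\infty} \left(\mathrm{Var}\left(\sum_{i=1}^{k} X_i\right) + (k\mu - n\nu\mu)^2\right) P(N_n = k) \\
&\quad + \sum_{k=0}^{m} \left(k\sigma^2 + 2k\sum_{j=1}^{m} a_j \Gamma_{k,j} - 2\sum_{j=1}^{m} j a_j \Gamma_{k,j} + k^2\mu^2 - 2n\nu\mu^2 k + n^2\nu^2\mu^2\right) P(N_n = k).
\end{aligned}$$

Noting that for $k \geq m + 1$, $\mathrm{Var}\left(\sum_{i=1}^{k} X_i\right) = k\left(\sigma^2 + 2\sum_{j=1}^{m} a_j\right) - 2\sum_{j=1}^{m} j a_j$, we get

$$\begin{aligned}
\mathrm{Var}\left(\sum_{i=1}^{N_n} X_i\right) &= \sum_{k=0}^{\infty} \left(k\sigma^2 + 2k\sum_{j=1}^{m} a_j - 2\sum_{j=1}^{m} j a_j + k^2\mu^2 - 2kn\nu\mu^2 + n^2\nu^2\mu^2\right) P(N_n = k) \\
&\quad + \sum_{k=0}^{m} \Bigg(k\sigma^2 + 2k\sum_{j=1}^{m} a_j \Gamma_{k,j} - 2\sum_{j=1}^{m} j a_j \Gamma_{k,j} + k^2\mu^2 - 2n\nu\mu^2 k + n^2\nu^2\mu^2 \\
&\qquad\qquad - k\sigma^2 - 2k\sum_{j=1}^{m} a_j + 2\sum_{j=1}^{m} j a_j - k^2\mu^2 + 2kn\nu\mu^2 - n^2\nu^2\mu^2\Bigg) P(N_n = k).
\end{aligned}$$

After some cancellations and using the values $E[N_n] = n\nu$ and $E[N_n^2] = n\tau^2 + n^2\nu^2$, we finally arrive
at

$$\mathrm{Var}\left(\sum_{i=1}^{N_n} X_i\right) = n\left(\nu\sigma^2 + 2\nu\sum_{j=1}^{m} a_j + \mu^2\tau^2\right) + \alpha(m)$$

where

$$\alpha(m) = \sum_{k=0}^{m}\left(2k\sum_{j=1}^{m} a_j(\Gamma_{k,j} - 1) - 2\sum_{j=1}^{m} j a_j(\Gamma_{k,j} - 1)\right)P(N_n = k) - 2\sum_{j=1}^{m} j a_j.$$

The assertion that

$$\frac{\alpha(m)}{n} \to 0$$

as $n \to \infty$ follows from the fact that all the variables are bounded. □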



5 Conclusion 



In this paper, we established a central limit theorem for random sums of stationary $m$-dependent processes.
Our proof is an extension of the argument given in [4] for the i.i.d. case, and this enables us to recover their
result. At the same time, we were able to give variations of the results in [13]. In subsequent research
we are planning to (1) obtain convergence rates for Theorem 2.3, (2) relax the $m$-dependence condition
to a weak local dependence condition (for such conditions, see [5]), (3) adapt the size biasing technique
often used in normal approximation to the case of random sums (See, for example, 0) and (4) find more 
applications on non-parametric statistics. 